Building an English-Iraqi Arabic machine translation system for spoken utterances with limited resources
Abstract
This paper presents an English-Iraqi Arabic speech-to-speech statistical machine translation system using limited resources. In it, we explore the constraints involved, how we endeavored to mitigate such problems as a non-standard orthography and a highly inflected grammar, and discuss leveraging existing plentiful resources for Modern Standard Arabic to assist in this task. These combined techniques yield a reduction in unknown words at translation time by over 40% and a +3.65 increase in BLEU score over a previous state-of-the-art system using the same parallel training corpus of spoken utterances.
Similar resources
Morphological Modeling for Machine Translation of English-Iraqi Arabic Spoken Dialogs
This paper addresses the problem of morphological modeling in statistical speech-to-speech translation for English to Iraqi Arabic. An analysis of user data from a real-time MT-based dialog system showed that generating correct verbal inflections is a key problem for this language pair. We approach this problem by enriching the training data with morphological information derived from source-side...
Colloquial Iraqi ASR for speech translation
In this paper we describe a real-time speech recognition system developed for colloquial Iraqi Arabic. This system is currently used in our speech-to-speech translation system configured for bi-directional communication in English and Iraqi on a laptop. We present experimental results on Iraqi utterances from different speech-to-speech translation domains, and analyze the usefulness of acoustic...
The BBN 2007 displayless English/Iraqi speech-to-speech translation system
Spoken communication across a language barrier is of increasing importance in both civilian and military applications. In this paper, we present an English/Iraqi Arabic speech-to-speech translation system for the military force protection domain (checkpoints, municipal services surveys, basic descriptions of people, houses, vehicles, etc). The system combines statistical N-gram speech recogniti...
Improvements in machine translation for English/Iraqi speech translation
In this paper, we describe techniques for improving machine translation quality in the context of speech-to-speech translation for significantly different language pairs. Specifically, we explore three broad approaches for improving translation from English to Iraqi and vice versa. First, we investigate normalization techniques which address the differences in spoken and written forms of both l...
A Hybrid Phrase-based/Statistical
Spoken communication across a language barrier is of increasing importance in both civilian and military applications. In this paper, we present a system for task-directed 2-way communication between speakers of English and Iraqi colloquial Arabic. The application domain of the system is force protection. The system supports translingual dialogue in areas that include municipal services surveys,...